Association Mining in Gradually Changing Domains

نویسندگان

  • Antonin Rozsypal
  • Miroslav Kubat
چکیده

Association mining explores algorithms capable of detecting frequently co-occurring items in transactions. A transaction can be identified with a market basket—a list of items a customer pays for at the checkout desk. In this paper, we explore a framework for the detection of changes in the buying patterns, as affected by fashion, season, or the introduction of a new product. We present several versions of our algorithm and experimentally examine their behaviors in domains with gradually changing domains. Introduction Let a database consist of transactions T1, T2, . . . , TN such that ∀i, Ti ⊆ I , where I is a set of items. Let an itemset, X , be defined as a group of items such that ∃i,X ⊆ Ti. A support of itemset X is the number of transactions, Ti, containing this itemset. Suppose a user of an associationmining algorithm submits a minimum support, θ. A popular research issue is how to detect all itemsets whose support is at least equal to θ. We will call them high-support itemsets. In a classical application, the transactions are identified with lists of items a customer pays for at the checkout desk (market baskets). A supermarket then places associated items on neighboring shelves, advertises them in the same catalogues, and avoids discounts on more than one member of the same itemset. However, the paradigm extends well beyond the realm of supermarket data. In the Internet environment, a transaction may consist of hyperlinks pointing to a web page, and high-support itemsets then signal associations among web sites (Noel, Raghavan, & Chu 2001). In a medical application, each transaction can summarize a patient’s history, and association mining may seek co-occurring symptoms or ailments. Here, we are interested in domains where the list of highsupport itemsets is subject to changes in time, for instance as a result of fashion or seasonal impacts (Cheung & Han 1996; Pitkow 1997; Raghavan & Hafez 2000). We will assume the framework of block evolution (Ganti, Gehrke, & Ramakrishnan 2000) where a block of market baskets is periodically added to an existing database as individual stores report their daily business. The task is to adapt to this change Copyright c © 2003, American Association for Artificial Intelligence (www.aaai.org). All rights reserved. and to measure the accuracy of this adaptation. To this end, we recently developed a novel algorithm (Rozsypal & Kubat 2002) and investigated its behavior in domains where the change is abrupt. However, this may be too much of a simplification. More often than not, the “Winter” buying patterns only gradually replace the “Fall” patterns. This is the scenario we address. Somewhat surprisingly, it turns out that, under this circumstance, different operators for change detection and for knowledge update are useful. We report a series of experiments illustrating the point.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Evolving association streams

The increasing bulk of data generation in industrial and scientific applications has fostered practitioners’ interest in mining large amounts of unlabeled data in the form of continuous, high speed, and time-changing streams of information. An appealing field is association stream mining, which models dynamically complex domains via rules without assuming any a priori structure. Different from ...

متن کامل

Mining Association Rules in Geographical Spatio-temporal Data

For the sake of environmental change monitoring, a huge amount of geospatial and temporal data have been acquired through various networks of monitoring stations. For instance, daily precipitation and air temperature are observed at meteorological stations, and MODIS images are regularly received at satellite ground stations. However, so far these massive raw data from the stations are not full...

متن کامل

Multi-objective Numeric Association Rules Mining via Ant Colony Optimization for Continuous Domains without Specifying Minimum Support and Minimum Confidence

Currently, all search algorithms which use discretization of numeric attributes for numeric association rule mining, work in the way that the original distribution of the numeric attributes will be lost. This issue leads to loss of information, so that the association rules which are generated through this process are not precise and accurate. Based on this fact, algorithms which can natively h...

متن کامل

A mining method for tracking changes in temporal association rules from an encoded database

Mining of association rules has become vital in organizations for decision making. The principle of data mining is better to use complicative primitive patterns and simple logical combination than simple primitive patterns and complex logical form. This paper overviews the concept of temporal database encoding, association rules mining. It proposes an innovative approach of data mining to reduc...

متن کامل

A Regression-Based Approach for Improving the Association Rule Mining through Predicting the Number of Rules on General Datasets

Association rule mining is one of the useful techniques in data mining and knowledge discovery that extracts interesting relationships between items in datasets. Generally, the number of association rules in a particular dataset mainly depends on the measures of ’support’ and ’confidence’. To choose the number of useful rules, normally, the measures of ’support’ and ’confidence’ need to be trie...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003